{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Getting started with HTMD"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Assuming that you have already downloaded and installed **HTMD**, this tutorial introduces you to the software, specially into the `Molecule` class, whose features serve as a good introduction for the more complex features of HTMD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Let's get started! The first thing you will have to get familiar with in HTMD is the [Molecule](https://www.htmd.org/docs/latest/moleculekit.molecule.html) class."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "``Molecule`` objects:\n",
    "* Store structural information on molecules.\n",
    "* do __not only__ contain a single molecule.\n",
    "* can contain a whole system including water, ions, proteins, ligands, lipids etc., in a similar way to [VMD](http://www.ks.uiuc.edu/Research/vmd/) (a visualization software we also use in HTMD)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Think of these objects as containers of structural information."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "First, we need to import HTMD, so that any class and function defined by HTMD is available in the workspace."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "In HTMD, there are several submodules, and an easier way to import the most important ones is to use the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. \n",
      "https://dx.doi.org/10.1021/acs.jctc.6b00049\n",
      "Documentation: http://software.acellera.com/\n",
      "To update: conda update htmd -c acellera -c psi4\n",
      "\n",
      "You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/Acellera/htmd/htmd).\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from htmd.ui import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Reading files"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [Molecule](https://www.htmd.org/docs/latest/moleculekit.molecule.html) class provides file readers for various structure formats like PDB, PRMTOP, PSF, GRO, MOL2, MAE and more. It is also able to read various MD trajectory and coordinate formats including XTC, DCD, COOR, CRD, TRR, XYZ etc.\n",
    "The method for reading files is [Molecule.read()](https://www.htmd.org/docs/latest/moleculekit.molecule.html#moleculekit.molecule.Molecule.read), however you can also specify the file name in the class constructor and it will automatically call `read()`.\n",
    "Let's see an example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2018-03-08 13:39:38,044 - htmd.molecule.readers - INFO - Using local copy for 3PTB: /home/joao/maindisk/software/repos/Acellera/htmd/htmd/data/pdb/3ptb.pdb\n",
      "2018-03-08 13:39:38,221 - moleculekit.molecule - WARNING - Residue insertions were detected in the Molecule. It is recommended to renumber the residues using the Molecule.renumberResidues() method.\n"
     ]
    }
   ],
   "source": [
    "mol = Molecule('3PTB')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "or just use a local file:\n",
    "```python\n",
    "mol = Molecule('yourprotein.pdb')\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "PDB files contain both atom information and coordinates. Some other formats separate the atom information from the coordinates. \n",
    "In that case you can start for example by reading atom information from a PSF file and then read atom coordinates using the `read` method of `Molecule` as in the next example.\n",
    "You could also read them in reverse order, creating the `Molecule` using the XTC and then reading the PSF (it would not matter).\n",
    "```python\n",
    "    mol = Molecule('yourstructure.psf')\n",
    "    mol.read('yourtrajectory.xtc')\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2016-10-10T13:53:29.811236",
     "start_time": "2016-10-10T13:53:29.806944"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Writing files\n",
    "\n",
    "The `Molecule` class also provides file writers for multiple formats using the [Molecule.write()](https://www.htmd.org/docs/latest/moleculekit.molecule.html#moleculekit.molecule.Molecule.write) method.\n",
    "```python\n",
    "mol.write('yourtrajectory.dcd')\n",
    "mol.write('yourstructure.prmtop')\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Looking inside a Molecule"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Printing the `Molecule` object shows its properties:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Molecule with 1701 atoms and 1 frames\n",
      "Atom field - altloc shape: (1701,)\n",
      "Atom field - atomtype shape: (1701,)\n",
      "Atom field - beta shape: (1701,)\n",
      "Atom field - chain shape: (1701,)\n",
      "Atom field - charge shape: (1701,)\n",
      "Atom field - coords shape: (1701, 3, 1)\n",
      "Atom field - element shape: (1701,)\n",
      "Atom field - insertion shape: (1701,)\n",
      "Atom field - masses shape: (1701,)\n",
      "Atom field - name shape: (1701,)\n",
      "Atom field - occupancy shape: (1701,)\n",
      "Atom field - record shape: (1701,)\n",
      "Atom field - resid shape: (1701,)\n",
      "Atom field - resname shape: (1701,)\n",
      "Atom field - segid shape: (1701,)\n",
      "Atom field - serial shape: (1701,)\n",
      "angles shape: (0, 3)\n",
      "bonds shape: (42, 2)\n",
      "bondtype shape: (42,)\n",
      "box shape: (3, 1)\n",
      "boxangles shape: (3, 1)\n",
      "crystalinfo: {'a': 54.890000000000001, 'b': 58.520000000000003, 'c': 67.629999999999995, 'alpha': 90.0, 'beta': 90.0, 'gamma': 90.0, 'sGroup': ['P', '21', '21', '21'], 'z': 4, 'numcopies': 4, 'rotations': array([[[ 1.,  0.,  0.],\n",
      "        [ 0.,  1.,  0.],\n",
      "        [ 0.,  0.,  1.]],\n",
      "\n",
      "       [[-1.,  0.,  0.],\n",
      "        [ 0., -1.,  0.],\n",
      "        [ 0.,  0.,  1.]],\n",
      "\n",
      "       [[-1.,  0.,  0.],\n",
      "        [ 0.,  1.,  0.],\n",
      "        [ 0.,  0., -1.]],\n",
      "\n",
      "       [[ 1.,  0.,  0.],\n",
      "        [ 0., -1.,  0.],\n",
      "        [ 0.,  0., -1.]]]), 'translations': array([[  0.   ,   0.   ,   0.   ],\n",
      "       [ 27.445,   0.   ,  33.815],\n",
      "       [  0.   ,  29.26 ,  33.815],\n",
      "       [ 27.445,  29.26 ,   0.   ]])}\n",
      "dihedrals shape: (0, 4)\n",
      "fileloc shape: (1, 2)\n",
      "impropers shape: (0, 4)\n",
      "reps: \n",
      "ssbonds shape: (0,)\n",
      "step shape: (1,)\n",
      "time shape: (1,)\n",
      "topoloc: 3PTB\n",
      "viewname: 3PTB\n"
     ]
    }
   ],
   "source": [
    "print(mol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Properties and methods of `Molecule` objects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each `Molecule` object has a number of **properties** (data associated to the molecule) and **methods** (operations that you can perform on the molecule). Some of the properties correspond to data which is usually found in PDB files.\n",
    "\n",
    "| Properties |Methods      |\n",
    "|:----------:|:-----------:|\n",
    "|record      |read( )      |\n",
    "|serial      |write( )     |\n",
    "|name        |get( )       |\n",
    "|resname     |set( )       |\n",
    "|chain       |atomselect( )|\n",
    "|resid       |copy( )      |\n",
    "|segid       |filter( )    |\n",
    "|coords      |append( )    |\n",
    "|box         |insert( )    |\n",
    "|reps        |view( )      |\n",
    "|...         |moveBy( )    |\n",
    "|            |rotateBy( )  |\n",
    "|            |...          |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Properties can be accessed,\n",
    "\n",
    "* either directly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([   1,    2,    3, ..., 1700, 1701, 1702])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mol.serial"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "or,\n",
    "\n",
    "* via the `Molecule.get` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([   1,    2,    3, ..., 1700, 1701, 1702])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mol.get('serial')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "To get help on a particular method of the `Molecule()` class, one can do:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function get in module moleculekit.molecule:\n",
      "\n",
      "get(self, field, sel=None)\n",
      "    Retrieve a specific PDB field based on the selection\n",
      "    \n",
      "    Parameters\n",
      "    ----------\n",
      "    field : str\n",
      "        The PDB field we want to get\n",
      "    sel : str\n",
      "        Atom selection string for which atoms we want to get the field from. Default all.\n",
      "        See more `here <http://www.ks.uiuc.edu/Research/vmd/vmd-1.9.2/ug/node89.html>`__\n",
      "    \n",
      "    Returns\n",
      "    ------\n",
      "    vals : np.ndarray\n",
      "        Array of values of `field` for all atoms in the selection.\n",
      "    \n",
      "    Examples\n",
      "    --------\n",
      "    >>> mol=tryp.copy()\n",
      "    >>> mol.get('resname')\n",
      "    array(['ILE', 'ILE', 'ILE', ..., 'HOH', 'HOH', 'HOH'], dtype=object)\n",
      "    >>> mol.get('resname', sel='resid 158')\n",
      "    array(['LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU'], dtype=object)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(Molecule.get)"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  },
  "livereveal": {
   "scroll": true
  },
  "nav_menu": {},
  "toc": {
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": 6,
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
